ABSTRACT:

We propose a new parameterized method for the defuzzification process based on the
simple M-SLIDE transformation. We develop a computationally efficient algorithm for learning
the relevant parameter and provide a computationally simple scheme for performing the
defuzzification step in fuzzy logic controllers. The M-SLIDE method results in a particularly
simple linear form of the algorithm for learning the parameter, which can be used both off-line
and on-line.


1. Introduction 

Recently, with the intensive development of fuzzy control [1, 2], the problem of selecting
a crisp representation of a fuzzy set, defuzzification, has become one of the most important issues in
fuzzy logic. In [3, 4] it was shown that the commonly used defuzzification methods, Center of
Area (COA) and Mean of Maxima (MOM) [1, 2], are only special cases of a more general
defuzzification method, called Generalized Defuzzification via BAsic Defuzzification Distribution
(BADD). The BAD distribution $v_i$, $i = (1, n)$, of a fuzzy set $D$ with membership function
$D(x_i) = w_i$, $w_i \in [0, 1]$, is derived from its possibility distribution by use of the BADD
transformation:


$$v_i = \frac{w_i^{\alpha}}{\sum_{j=1}^{n} w_j^{\alpha}}, \qquad \alpha \ge 0 \qquad (1)$$

The BADD transformation converts the possibility distribution $w_i$ to a probability distribution $v_i$ in
a manner that preserves the features of $D$: $w_i > w_j \Rightarrow v_i > v_j$ and
$w_i = w_j \Rightarrow v_i = v_j$. For $\alpha = 1$ the BADD transformation converts the possibility
distribution $w_i$, $i = (1, n)$, proportionally to the BAD distribution $v_i$, $i = (1, n)$. For
$\alpha > 1$ it discounts the elements of $X$ with lower grades of membership $w_i$. Through the
parameter $\alpha$ the BADD transformation relates the probability distribution $v(x)$ to our
confidence in the model [3, 4]. An increase in $\alpha$ is associated with a decrease in uncertainty,
a decrease in entropy, and an increase in confidence. The defuzzified value obtained via the BADD
approach is defined as the expected value of $X$ over the BAD distribution $v_i$, $i = (1, n)$:

$$d^{BADD} = \sum_{i=1}^{n} x_i \, \frac{w_i^{\alpha}}{\sum_{j=1}^{n} w_j^{\alpha}}, \qquad \alpha \ge 0 \qquad (2)$$
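Expression (2) can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `badd_defuzzify` is hypothetical, and the sample fuzzy set is $U_1$ from the example in Section 3. It relies on the known limiting behaviour of BADD ($\alpha = 1$ gives COA, large $\alpha$ approaches MOM).

```python
def badd_defuzzify(xs, ws, alpha):
    """Expected value of xs under the BAD distribution
    v_i = w_i**alpha / sum_j w_j**alpha (expressions (1) and (2))."""
    if alpha < 0:
        raise ValueError("alpha must be nonnegative")
    powered = [w ** alpha for w in ws]
    total = sum(powered)
    return sum(x * p for x, p in zip(xs, powered)) / total


# Fuzzy set U_1 from the example in Section 3: support points and grades.
xs = [3, 4, 5, 6, 7, 8]
ws = [0.0, 0.6, 1.0, 0.8, 0.9, 0.0]

coa_like = badd_defuzzify(xs, ws, 1.0)    # alpha = 1: proportional case (COA)
mom_like = badd_defuzzify(xs, ws, 200.0)  # large alpha: close to MOM (x = 5)
```

The dependence on `alpha` through the powers `w ** alpha` is what makes fitting $\alpha$ to data a nonlinear problem.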

It is evident that, for fixed $\alpha$, the defuzzified value $d^{BADD}$ minimizes the mean square error
$E\{(x - d^{BADD})^2\}$. Thus the BADD defuzzified value is the optimal defuzzified value in the sense
of minimizing the criterion

$$\sum_{i} (x_i - d^{BADD})^2 \, v_i \qquad (3)$$

The main conclusion of this approach was that the best defuzzified value in the sense of the
above criterion can be obtained by adapting the parameter $\alpha$ through learning. Unfortunately,
learning the parameter $\alpha$ from a given data set directly from expression (2) is a
constrained nonlinear programming problem, and its solution is difficult in real control applications.
In this paper we solve the learning problem by introducing a new transformation of the
possibility distribution $w_i$, $i = (1, n)$, to the probability distribution $v_i$, $i = (1, n)$, called
the Modified SemiLInear DEfuzzification (M-SLIDE) transformation. This new
transformation yields a simple linear expression for the defuzzified value involving one
parameter. An algorithm for learning the parameter is proposed.


2. M-SLIDE Defuzzification Technique 


Let the probability distribution $u_i$, $i = (1, n)$, be obtained by the proportional transformation
(normalization) of $w_i$,

$$u_i = \frac{w_i}{\sum_{j=1}^{n} w_j}, \qquad i = (1, n). \qquad (4)$$

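The following transformation of the probability distribution $u_i$, $i = (1, n)$, to a probability
distribution $v_i$, $i = (1, n)$, is defined as the M-SLIDE transformation:

$$v_i = \begin{cases} \dfrac{1}{m}\Bigl[1 - (1-\beta) \sum_{j \notin M} u_j\Bigr] & \text{if } i \in M \\[2ex] (1-\beta)\, u_i & \text{if } i \notin M \end{cases} \qquad (5)$$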


where $m = \mathrm{card}(M)$ is the cardinality of the set $M$ of elements with maximal membership grades:

$$M = \{\, i \mid w_i = \max_j [w_j] \,\}$$

The derivation of the M-SLIDE transformation is given in detail in Yager & Filev [5].
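As an illustration, the normalization (4) followed by the M-SLIDE transformation (5) might be coded as below. This is a sketch under stated assumptions: the function name `mslide_transform` is hypothetical, membership grades are passed as a plain list, and the set $M$ is found by exact comparison with the maximum grade.

```python
def mslide_transform(ws, beta):
    """Map membership grades ws to the M-SLIDE distribution v_i of (5)."""
    total = sum(ws)
    us = [w / total for w in ws]                  # transformation (4)
    w_max = max(ws)
    in_m = [w == w_max for w in ws]               # True where i belongs to M
    m = sum(in_m)                                 # m = card(M)
    outside_mass = sum(u for u, flag in zip(us, in_m) if not flag)
    peak = (1.0 - (1.0 - beta) * outside_mass) / m
    return [peak if flag else (1.0 - beta) * u for u, flag in zip(us, in_m)]


ws = [0.0, 0.6, 1.0, 0.8, 0.9, 0.0]               # grades of U_1 (Section 3)
vs_half = mslide_transform(ws, 0.5)               # mass shifted halfway to M
vs_zero = mslide_transform(ws, 0.0)               # equals the normalized u_i
vs_one = mslide_transform(ws, 1.0)                # all mass on M
```

Mass removed from the elements outside $M$ is redistributed equally over the $m$ elements of $M$, so the result always sums to one.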

The following theorem [5] shows some of the significant properties of the probability 
distribution obtained via the M-SLIDE transformation. 

Theorem 1: Let $w_i$, $i = (1, n)$, be the possibility distribution of a given fuzzy set and let $v_i$,
$i = (1, n)$, be obtained by application of transformation (4) followed by (5). Then:

i. the distribution $v_i$, $i = (1, n)$, is a probability distribution;

ii. $w_i = w_j \Rightarrow v_i = v_j$, $\forall i, j = (1, n)$ (identity);

iii. $w_i > w_j \Rightarrow v_i > v_j$, $\forall i, j = (1, n)$ (monotonicity);

iv. $\beta = 0 \Rightarrow v_i = \dfrac{w_i}{\sum_{j=1}^{n} w_j}$, $i = (1, n)$;

v. $\beta = 1 \Rightarrow v_i = 0$, $i \notin M$, and $v_i = \dfrac{1}{m}$, $i \in M$.

An immediate consequence of Theorem 1 is that the entropy of the M-SLIDE distribution
$v_i$ is maximal for $\beta = 0$ and minimal for $\beta = 1$.
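The entropy statement can be checked numerically. The sketch below assumes Shannon entropy (the text does not name the entropy measure) and uses hypothetical helper names; the grades are those of $U_2$ from the example in Section 3.

```python
import math


def mslide_transform(ws, beta):
    """Transformations (4) and (5) applied to membership grades ws."""
    total, w_max = sum(ws), max(ws)
    us = [w / total for w in ws]
    m = sum(1 for w in ws if w == w_max)
    outside = sum(u for u, w in zip(us, ws) if w != w_max)
    peak = (1.0 - (1.0 - beta) * outside) / m
    return [peak if w == w_max else (1.0 - beta) * u for u, w in zip(us, ws)]


def entropy(vs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(v * math.log(v) for v in vs if v > 0.0)


ws = [0.0, 0.9, 1.0, 1.0, 0.2, 0.0]           # membership grades of U_2
betas = [k / 10 for k in range(11)]           # 0.0, 0.1, ..., 1.0
entropies = [entropy(mslide_transform(ws, b)) for b in betas]
```

On this grid the entropy does not increase with $\beta$, and at $\beta = 1$ it equals $\ln m$ with $m = 2$, the entropy of a uniform distribution over $M$.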
When the M-SLIDE transformation is used to obtain the probability distribution $v_i$, the
expected value $d$ with respect to the elements $x_i$ of the support set is

$$d = \sum_{i=1}^{n} v_i x_i = (1-\beta) \sum_{i \notin M} u_i x_i + \frac{1}{m}\Bigl[1 - (1-\beta) \sum_{i \notin M} u_i\Bigr] \sum_{j \in M} x_j$$

$$d = (1-\beta) \sum_{i \notin M} u_i (x_i - d^{MOM}) + d^{MOM}$$

where $d^{MOM}$ is the MOM defuzzified value,

$$d^{MOM} = \frac{1}{m} \sum_{j \in M} x_j .$$

It is evident that expected value d generalizes the MOM defuzzified value. 

Definition 1. The process of selecting a deterministic value from the universe of discourse of
a given fuzzy set by evaluation of the expected value $d$ is called the Modified Semi Linear
DEfuzzification (M-SLIDE) Method. The defuzzified value, denoted $d^{MS}$, obtained by
application of the M-SLIDE method is called the M-SLIDE value and is defined as

$$d^{MS} = (1-\beta) \sum_{i \notin M} u_i (x_i - d^{MOM}) + d^{MOM} .$$

The next theorem shows the relationship between the M-SLIDE method and the commonly used
COA and MOM defuzzification methods.

Theorem 2. The M-SLIDE method reduces to the COA defuzzification method for $\beta = 0$ and to
the MOM defuzzification method for $\beta = 1$.

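Proof. For $\beta = 0$, writing $c = 1 / \sum_{j=1}^{n} w_j$ for the normalization constant of (4) and
$u_{max} = c\, w_{max}$ for the common value of $u_i$, $i \in M$,

$$d^{MS} = \sum_{i \notin M} u_i x_i + \frac{1}{m}\, m\, u_{max} \sum_{j \in M} x_j = \sum_{i \notin M} c\, w_i x_i + c\, w_{max} \sum_{j \in M} x_j$$

$$d^{MS} = \frac{1}{\sum_{j=1}^{n} w_j} \Bigl[ \sum_{i \notin M} w_i x_i + w_{max} \sum_{j \in M} x_j \Bigr] = d^{COA}$$

where $d^{COA}$ denotes the defuzzified value obtained by the COA defuzzification method.

For $\beta = 1$, $d^{MS} = d^{MOM}$.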

Theorem 3. The following expressions of the M-SLIDE defuzzified value, $d^{MS}$, are equivalent:

$$d^{MS} = (1-\beta) \sum_{i \notin M} u_i (x_i - d^{MOM}) + d^{MOM}$$

$$d^{MS} = \beta \sum_{i \notin M} u_i (d^{MOM} - x_i) + d^{COA}$$

$$d^{MS} = \beta\, d^{MOM} + (1-\beta)\, d^{COA}$$

$$d^{MS} = \beta\, (d^{MOM} - d^{COA}) + d^{COA}$$
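Proof.

$$\begin{aligned}
d^{MS} &= (1-\beta) \sum_{i \notin M} u_i (x_i - d^{MOM}) + d^{MOM} \\
&= \beta \sum_{i \notin M} u_i (d^{MOM} - x_i) + \sum_{i \notin M} u_i (x_i - d^{MOM}) + d^{MOM} \\
&= \beta \sum_{i \notin M} u_i (d^{MOM} - x_i) + \sum_{i \notin M} u_i x_i - \sum_{i \notin M} u_i\, d^{MOM} + d^{MOM} \\
&= \beta \sum_{i \notin M} u_i (d^{MOM} - x_i) + \sum_{i \notin M} u_i x_i - (1 - m\, u_{max})\, d^{MOM} + d^{MOM}
\end{aligned}$$

Since $m\, u_{max}\, d^{MOM} = u_{max} \sum_{j \in M} x_j$, the last three terms sum to $d^{COA}$, so

$$\begin{aligned}
d^{MS} &= \beta \sum_{i \notin M} u_i (d^{MOM} - x_i) + d^{COA} \\
&= \beta \sum_{i \notin M} u_i\, d^{MOM} - \beta \sum_{i \notin M} u_i x_i + d^{COA} \\
&= \beta\, (1 - m\, u_{max})\, d^{MOM} - \beta \sum_{i \notin M} u_i x_i + d^{COA} \\
&= \beta\, d^{MOM} - \beta\, m\, u_{max}\, \frac{1}{m} \sum_{i \in M} x_i - \beta \sum_{i \notin M} u_i x_i + d^{COA} \\
&= \beta\, d^{MOM} - \beta\, d^{COA} + d^{COA}
\end{aligned}$$

$$d^{MS} = \beta\, d^{MOM} + (1-\beta)\, d^{COA} = \beta\, (d^{MOM} - d^{COA}) + d^{COA}$$

Theorem 3 provides convenient forms for the M-SLIDE defuzzified value as a linear
function of the parameter $\beta$. In the next section we will use these forms to estimate the
parameter $\beta$ in a learning procedure capable of working on-line.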
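The equivalence of the first and third forms in Theorem 3 is easy to check numerically. The following sketch uses hypothetical function names and the fuzzy set $U_1$ from the example in Section 3; it is an illustration, not the authors' implementation.

```python
def coa(xs, ws):
    """Center of Area: membership-weighted mean of the support points."""
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)


def mom(xs, ws):
    """Mean of Maxima: average of the points with maximal membership."""
    w_max = max(ws)
    peaks = [x for x, w in zip(xs, ws) if w == w_max]
    return sum(peaks) / len(peaks)


def mslide_value(xs, ws, beta):
    """Third form of Theorem 3: beta * d_MOM + (1 - beta) * d_COA."""
    return beta * mom(xs, ws) + (1.0 - beta) * coa(xs, ws)


def mslide_value_first_form(xs, ws, beta):
    """First form: (1 - beta) * sum_{i not in M} u_i (x_i - d_MOM) + d_MOM."""
    total, w_max, d_mom = sum(ws), max(ws), mom(xs, ws)
    tail = sum(w / total * (x - d_mom) for x, w in zip(xs, ws) if w != w_max)
    return (1.0 - beta) * tail + d_mom


xs = [3, 4, 5, 6, 7, 8]
ws = [0.0, 0.6, 1.0, 0.8, 0.9, 0.0]
d_linear = mslide_value(xs, ws, 0.5)
d_first = mslide_value_first_form(xs, ws, 0.5)
```

For this set $d^{COA} = 18.5 / 3.3 \approx 5.606$ and $d^{MOM} = 5$, so $\beta = 0.5$ gives a value halfway between them, about 5.303, from either form.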

3. Algorithm for Learning the M-SLIDE Parameter 

In this section we solve the problem of learning the parameter $\beta$ of the M-SLIDE method
from a given sequence of fuzzy sets and desired defuzzified values. Furthermore, we demonstrate
that the M-SLIDE method can be used as an approximation of the Generalized Defuzzification
Method via the BAD Distribution [3].

Assume we are given a collection of fuzzy sets $U_k$ and the desired defuzzified values $d_k$,
$k = (1, K)$. We denote by $d_k^{MOM}$ and $d_k^{COA}$ the defuzzified values of the fuzzy sets under
the MOM and COA defuzzification methods. The problem of learning the parameter $\beta$ is equivalent
to the recursive solution of the set of linear equations

$$\beta\, (d_k^{MOM} - d_k^{COA}) + d_k^{COA} = d_k, \qquad k = (1, K).$$

For simplification we denote $c_k = d_k^{MOM} - d_k^{COA}$ and $y_k = d_k - d_k^{COA}$ and rewrite the set
of equations that has to be solved in the form $c_k \beta = y_k$ for $k = (1, K)$.

In general there is no guarantee that this set of equations can be exactly satisfied for some
value of $\beta$, nor that $c_k$ does not vanish for some $k$. For this reason we seek a least squares
solution of the set of equations under the assumption of noisy observation data. The solution of
this classical mathematical problem can be obtained by a number of different
techniques. In this paper we shall use an algorithm that is a deterministic version of the well-known
Kalman filter [6], which is usually used to solve the same kind of least squares
estimation problem for dynamic systems.
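For reference, the batch least squares solution of the scalar equations $c_k \beta = y_k$ has the closed form $\beta = \sum_k c_k y_k / \sum_k c_k^2$. The sketch below computes it; it is only a baseline for comparison and is not the recursive procedure developed in this section. The function name and the sample numbers are hypothetical.

```python
def batch_least_squares_beta(cs, ys):
    """Minimize sum_k (y_k - c_k * beta)**2 over beta in closed form."""
    denom = sum(c * c for c in cs)
    if denom == 0.0:
        # Every c_k vanishes (MOM equals COA for all sets): beta is not
        # identifiable from the data.
        raise ValueError("all c_k are zero; beta cannot be estimated")
    return sum(c * y for c, y in zip(cs, ys)) / denom


# Illustrative numbers only: here y_k = 0.5 * c_k exactly.
cs = [1.0, 2.0, -1.0]
ys = [0.5, 1.0, -0.5]
beta_ls = batch_least_squares_beta(cs, ys)
```

A single vanishing $c_k$ is harmless here because that sample simply drops out of both sums.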

The unknown parameter $\beta$ that has to be estimated is regarded as the state vector of a
hypothetical autonomous scalar dynamic system governed by the equations

$$\beta_{k+1} = \beta_k, \qquad y_k = c_k \beta_k + \varepsilon_k$$

where the term $\varepsilon_k$ denotes Gaussian white noise with covariance $r_k$. Then the recursive Kalman
filter that gives the best estimate of the state vector $\beta_k$ of this system has the form [6]:

$$\hat{\beta}_{k/k} = \hat{\beta}_{k/k-1} + g_k\, (y_k - c_k \hat{\beta}_{k/k-1}) \qquad (i)$$

$$\hat{\beta}_{k+1/k} = \hat{\beta}_{k/k} \qquad (ii)$$

$$p_{k/k-1} = p_{k-1/k-1} \qquad (iii)$$

$$g_k = \frac{p_{k/k-1}\, c_k}{c_k^2\, p_{k/k-1} + r_k} \qquad (iv)$$

$$p_{k/k} = p_{k/k-1} - g_k\, c_k\, p_{k/k-1} \qquad (v)$$


Roughly speaking, the Kalman filter calculates at every step the best estimate of the state vector as a
sum of the prediction of $\beta$ at step $k$ from its value at step $k-1$, $\hat{\beta}_{k/k-1}$, and a
correction term proportional to the difference between the current output value $y_k$ and the predicted
output $c_k \hat{\beta}_{k/k-1}$. Equation (iv) calculates the varying gain, $g_k$, of the filter. The
evolution of the error covariance is given by equation (v). Because of the static nature of the
autonomous system, $\hat{\beta}_{k+1/k} = \hat{\beta}_{k/k} = \beta_k$ and
$p_{k/k-1} = p_{k-1/k-1} = p_{k-1}$, which significantly simplifies the algorithm to

$$\beta_k = \beta_{k-1} + g_k\, (y_k - c_k \beta_{k-1}) \qquad (vi)$$

$$g_k = \frac{p_{k-1}\, c_k}{c_k^2\, p_{k-1} + r_k} \qquad (vii)$$

$$p_k = p_{k-1} - g_k\, c_k\, p_{k-1} \qquad (viii)$$

Substituting (vii) into (vi) and (viii) gives a more compact form of the algorithm:

$$\beta_k = \beta_{k-1} + \frac{p_{k-1}\, c_k}{c_k^2\, p_{k-1} + r_k}\, (y_k - c_k \beta_{k-1}) \qquad (ix)$$

$$p_k = p_{k-1} - \frac{p_{k-1}^2\, c_k^2}{c_k^2\, p_{k-1} + r_k} \qquad (x)$$


Because we usually have no idea of the magnitude of the additive noise $\varepsilon_k$, we shall
take $r_k = 1$. Then equations (ix) and (x) simplify further, and we obtain the following final form of
the Kalman filter algorithm for the recursive least squares solution of the original set of equations:

$$\beta_k = \beta_{k-1} + \frac{p_{k-1}\, c_k}{c_k^2\, p_{k-1} + 1}\, (y_k - c_k \beta_{k-1}) \qquad (xi)$$

$$p_k = \frac{p_{k-1}}{c_k^2\, p_{k-1} + 1} \qquad (xii)$$

Regarding the initial conditions, it can be argued [7] that a reasonable assumption is to
consider $\beta_0 = 0$ and nonnegative $p_0$.
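One iteration of the simplified recursion with $r_k = 1$ can be written as a small function. This is a sketch with a hypothetical name, `kalman_step`; the numeric inputs are illustrative only.

```python
def kalman_step(beta_prev, p_prev, c, y):
    """One update of (xi) and (xii) with unit noise covariance r_k = 1."""
    denom = c * c * p_prev + 1.0
    beta = beta_prev + p_prev * c / denom * (y - c * beta_prev)   # (xi)
    p = p_prev / denom                                            # (xii)
    return beta, p


# Start from beta_0 = 0 and p_0 = 1, then observe c = 2, y = 1.
beta_1, p_1 = kalman_step(0.0, 1.0, 2.0, 1.0)      # gives beta_1 = 0.4, p_1 = 0.2

# A sample with c = 0 carries no information: the estimate is unchanged.
beta_same, p_same = kalman_step(0.4, 0.2, 0.0, 3.0)
```

Because the denominator is at least 1, a vanishing $c_k$ causes no division problem; the update simply leaves $\beta$ and $p$ as they were.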
The algorithm gives an unconstrained solution for $\beta$. Because $\beta$ is required to
belong to the unit interval, we restrict the solution $\beta_k$ by applying a threshold to give the
value $\beta_k^*$, where

$$\beta_k^* = \begin{cases} 1 & \text{if } \beta_{k-1} + \Delta_k > 1 \\ 0 & \text{if } \beta_{k-1} + \Delta_k < 0 \\ \beta_{k-1} + \Delta_k & \text{otherwise} \end{cases}$$

where $\Delta_k$ denotes the second term on the right-hand side of (xi),

$$\Delta_k = \frac{p_{k-1}\, c_k}{c_k^2\, p_{k-1} + 1}\, (y_k - c_k \beta_{k-1}).$$

The thresholding can be replaced by the following nonlinear expression:

$$\beta_k^* = 1 - 0.5 \Bigl[ 1 - 0.5\, (\beta_{k-1} + \Delta_k + |\beta_{k-1} + \Delta_k|) + \bigl| 1 - 0.5\, (\beta_{k-1} + \Delta_k + |\beta_{k-1} + \Delta_k|) \bigr| \Bigr]$$
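That the absolute-value expression reproduces the threshold can be confirmed directly. In the sketch below, `z` stands for $\beta_{k-1} + \Delta_k$ and both function names are hypothetical.

```python
def threshold(z):
    """Piecewise clipping of z to the unit interval."""
    if z > 1.0:
        return 1.0
    if z < 0.0:
        return 0.0
    return z


def threshold_closed_form(z):
    """Same clipping written with absolute values only."""
    pos = 0.5 * (z + abs(z))                           # equals max(z, 0)
    return 1.0 - 0.5 * (1.0 - pos + abs(1.0 - pos))    # equals min(pos, 1)


samples = [-2.0, -0.3, 0.0, 0.25, 1.0, 1.7]
pairs = [(threshold(z), threshold_closed_form(z)) for z in samples]
```

The inner term implements $\max(z, 0)$ and the outer term implements $\min(\cdot, 1)$, so the two functions agree for every real `z`.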
The algorithm for learning the M-SLIDE parameter, based on the Kalman filter, can now be
summarized as follows.

Algorithm for learning the parameter $\beta$ (M-SLIDE Learning Algorithm)


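1. Set $\beta_0 = 0$; $p_0 > 0$.

2. Read a sample pair $U_k$, $d_k$.

3. Calculate: i. $d_k^{MOM}$; ii. $d_k^{COA}$; iii. $c_k = d_k^{MOM} - d_k^{COA}$; iv. $y_k = d_k - d_k^{COA}$.

4. Update $\beta_k$, $p_k$:

$$\beta_k = \beta_{k-1} + \frac{p_{k-1}\, c_k}{c_k^2\, p_{k-1} + 1}\, (y_k - c_k \beta_{k-1}) \qquad \text{and} \qquad p_k = \frac{p_{k-1}}{c_k^2\, p_{k-1} + 1}$$

5. Calculate $\beta_k^*$:

$$\beta_k^* = 1 - 0.5 \Bigl[ 1 - 0.5\, (\beta_{k-1} + \Delta_k + |\beta_{k-1} + \Delta_k|) + \bigl| 1 - 0.5\, (\beta_{k-1} + \Delta_k + |\beta_{k-1} + \Delta_k|) \bigr| \Bigr]$$

6. Update the current estimate of the parameter $\beta$: $\beta = \beta_k^*$.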

We note that since the estimate of the parameter $\beta$ is determined sequentially, there is no
need to re-solve the whole set of equations when a new data pair $(U_{k+1}, d_{k+1})$ becomes
available for learning. A new data pair can be incorporated by just one additional
iteration of the algorithm. This property allows the algorithm to be used for either off-line or
on-line learning of the parameter $\beta$.
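The six steps can be sketched as a short loop. Several choices here are assumptions rather than statements from the text: the function names, the initial value `p0 = 1000.0`, and the reading that the recursion carries the unconstrained $\beta_k$ forward while only the reported estimate is thresholded.

```python
def coa(xs, ws):
    """Center of Area defuzzified value."""
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)


def mom(xs, ws):
    """Mean of Maxima defuzzified value."""
    w_max = max(ws)
    peaks = [x for x, w in zip(xs, ws) if w == w_max]
    return sum(peaks) / len(peaks)


def learn_beta(samples, p0=1000.0):
    """Run steps 1-6 over (xs, ws, d) triples and return the clipped estimate."""
    beta, p = 0.0, p0                        # step 1 (p0 is an assumed value)
    beta_star = 0.0
    for xs, ws, d in samples:                # step 2
        d_coa, d_mom = coa(xs, ws), mom(xs, ws)
        c, y = d_mom - d_coa, d - d_coa      # step 3
        denom = c * c * p + 1.0
        beta += p * c / denom * (y - c * beta)   # step 4, equation (xi)
        p /= denom                               # step 4, equation (xii)
        beta_star = min(1.0, max(0.0, beta))     # steps 5 and 6
    return beta_star


# Two fuzzy sets from the example, with targets built from a known beta.
u1 = ([3, 4, 5, 6, 7, 8], [0.0, 0.6, 1.0, 0.8, 0.9, 0.0])
u2 = ([5, 7, 9, 11, 12, 13], [0.0, 0.9, 1.0, 1.0, 0.2, 0.0])
beta_true = 0.3
samples = [(xs, ws, beta_true * mom(xs, ws) + (1.0 - beta_true) * coa(xs, ws))
           for xs, ws in (u1, u2)]
beta_hat = learn_beta(samples)
```

With noise-free targets the estimate lands close to the generating value; the small remaining gap comes from the finite `p0`, which acts like a weak pull toward $\beta_0 = 0$.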

In the case when the desired defuzzified values $d_k$ are the defuzzified values
obtained from the defuzzification method using the BAD distribution, the algorithm can be used
to obtain the M-SLIDE parameter $\beta$ corresponding to a BADD transformation parameter $\alpha$.

The next example presents an application of the M-SLIDE learning algorithm. 

Example. Assume our data consists of 10 fuzzy sets:

U_1 = {0/3, 0.6/4, 1/5, 0.8/6, 0.9/7, 0/8};    U_2 = {0/5, 0.9/7, 1/9, 1/11, 0.2/12, 0/13};
U_3 = {0/2, 0.4/3, 0.8/4, 1/5, 0.5/6, 0/7};    U_4 = {0/4, 1/5, 0.9/6, 1/7, 0.9/8, 0/9};
U_5 = {0/6, 0.3/7, 1/8, 0.6/9, 1/10, 0/11};    U_6 = {0/3, 0.2/4, 0.9/7, 1/9, 1/10, 0/12};
U_7 = {0/1, 0.9/4, 0.5/5, 1/7, 0.4/8, 0/10};   U_8 = {0/3, 0.5/7, 0.9/10, 1/11, 0.4/14, 0/16};
U_9 = {0/5, 0.2/6, 1/7, 1/9, 0.1/10, 0/11};    U_10 = {0/4, 1/7, 0.5/8, 1/9, 0.7/10, 0/11}.

We used the BADD defuzzification method to generate the ideal defuzzified values, $d_k$,
associated with each of these fuzzy sets. In this way we formed six different data sets, each
consisting of 10 pairs $(U_k, d_k)$. In each data set the $d_k$'s were generated by a different BADD
parameter $\alpha$.

For each data set, using the M-SLIDE learning algorithm, we obtained the optimal estimate
for the parameter $\beta$. The following tables show the results of the experimentation with our
algorithm. In the tables below, $d_k$ is the ideal value and $d_k^c$ is the defuzzified value
calculated using the M-SLIDE defuzzification procedure with the optimal estimated $\beta$
parameter for that data set.


DATA SET #1    OPTIMAL ESTIMATED $\beta$ = 0.00022

k           1      2      3      4      5      6      7      8      9     10
$d_k^c$   5.60   9.26   4.59   6.47   8.79   8.42   5.82  10.39   7.91   8.43
$d_k$     5.60   9.26   4.59   6.47   8.79   8.42   5.82  10.39   7.91   8.43

DATA SET #2    OPTIMAL ESTIMATED $\beta$ = 0.10758

k           1      2      3      4      5      6      7      8      9     10
$d_k^c$   5.54   9.34   4.64   6.42   8.82   8.54   5.95  10.46   7.92   8.39
$d_k$     5.71   9.21   4.70   6.42   8.98   8.82   5.76  10.46   7.99   8.28

DATA SET #3    OPTIMAL ESTIMATED $\beta$ = 0.22539

k           1      2      3      4      5      6      7      8      9     10
$d_k^c$   5.47   9.43   4.68   6.37   8.84   8.66   6.09  10.53   7.93   8.34
$d_k$     5.72   9.32   4.77   6.37   9.00   8.93   5.88  10.58   8.00   8.15

DATA SET #4    OPTIMAL ESTIMATED $\beta$ = 0.66891

k           1      2      3      4      5      6      7      8      9     10
$d_k^c$   5.20   9.75   4.87   6.16   8.93   9.14   6.61  10.80   7.97   8.14
$d_k$     5.36   9.72   4.97   6.17   9.00   9.27   6.49  10.83   8.00   8.00

DATA SET #5    OPTIMAL ESTIMATED $\beta$ = 0.92394

k           1      2      3      4      5      6      7      8      9     10
$d_k^c$   5.05   9.94   4.97   6.04   8.98   9.42   6.91  10.95   7.99   8.03
$d_k$     5.08   9.94   5.00   6.04   9.00   9.45   6.88  10.96   8.00   8.00
DATA SET #6    OPTIMAL ESTIMATED $\beta$ = 0.97293

k           1      2      3      4      5      6      7      8      9     10
$d_k^c$   5.02   9.98   4.99   6.01   8.99   9.47   6.97  10.98   8.00   8.01
$d_k$     5.03   9.98   5.00   6.01   9.00   9.48   6.96  10.99   8.00   8.00

It can be seen from the above example that the M-SLIDE learning algorithm learns values of the
parameter $\beta$ that allow a very good matching of the data set.
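An experiment of this kind can be sketched as follows. The $\alpha$ values behind the six data sets are not stated in the text, so the two values used here ($\alpha = 1$, which reproduces COA, and $\alpha = 50$) are illustrative assumptions, as are the function names and the initial value `p0 = 1000.0`; the numbers produced are not claimed to match the tables.

```python
FUZZY_SETS = [  # (support points, membership grades) for U_1 ... U_10
    ([3, 4, 5, 6, 7, 8], [0.0, 0.6, 1.0, 0.8, 0.9, 0.0]),
    ([5, 7, 9, 11, 12, 13], [0.0, 0.9, 1.0, 1.0, 0.2, 0.0]),
    ([2, 3, 4, 5, 6, 7], [0.0, 0.4, 0.8, 1.0, 0.5, 0.0]),
    ([4, 5, 6, 7, 8, 9], [0.0, 1.0, 0.9, 1.0, 0.9, 0.0]),
    ([6, 7, 8, 9, 10, 11], [0.0, 0.3, 1.0, 0.6, 1.0, 0.0]),
    ([3, 4, 7, 9, 10, 12], [0.0, 0.2, 0.9, 1.0, 1.0, 0.0]),
    ([1, 4, 5, 7, 8, 10], [0.0, 0.9, 0.5, 1.0, 0.4, 0.0]),
    ([3, 7, 10, 11, 14, 16], [0.0, 0.5, 0.9, 1.0, 0.4, 0.0]),
    ([5, 6, 7, 9, 10, 11], [0.0, 0.2, 1.0, 1.0, 0.1, 0.0]),
    ([4, 7, 8, 9, 10, 11], [0.0, 1.0, 0.5, 1.0, 0.7, 0.0]),
]


def badd(xs, ws, alpha):
    """BADD defuzzified value, expression (2); alpha = 1 gives COA."""
    powered = [w ** alpha for w in ws]
    return sum(x * p for x, p in zip(xs, powered)) / sum(powered)


def mom(xs, ws):
    """Mean of Maxima defuzzified value."""
    w_max = max(ws)
    peaks = [x for x, w in zip(xs, ws) if w == w_max]
    return sum(peaks) / len(peaks)


def run_experiment(alpha, p0=1000.0):
    """Generate BADD targets for one alpha, learn beta, return fitted values."""
    beta, p = 0.0, p0
    targets = [badd(xs, ws, alpha) for xs, ws in FUZZY_SETS]
    for (xs, ws), d in zip(FUZZY_SETS, targets):
        d_coa, d_mom = badd(xs, ws, 1.0), mom(xs, ws)
        c, y = d_mom - d_coa, d - d_coa
        denom = c * c * p + 1.0
        beta += p * c / denom * (y - c * beta)     # equation (xi)
        p /= denom                                 # equation (xii)
    beta = min(1.0, max(0.0, beta))                # threshold to [0, 1]
    fitted = [beta * mom(xs, ws) + (1.0 - beta) * badd(xs, ws, 1.0)
              for xs, ws in FUZZY_SETS]
    return beta, targets, fitted


beta_coa, targets_coa, fitted_coa = run_experiment(1.0)    # targets are COA
beta_peaky, targets_peaky, fitted_peaky = run_experiment(50.0)
```

With $\alpha = 1$ every $y_k$ is zero, so the estimate stays at $\beta_0 = 0$; with a large $\alpha$ the targets sit near the MOM values and the learned $\beta$ moves toward 1, which is the qualitative trend seen across the six data sets.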